BIOTECHNOLOGY SYSTEMS BRANCH



RAW SEQUENCE LISTING 
ERROR REPORT 




The Biotechnology Systems Branch of the Scientific and Technical Information 
Center (STIC) detected errors when processing the following computer readable 
form: 

Application Serial Number: 09/381,810

Source: € I P^ 



Date Processed by STIC: 




THE ATTACHED PRINTOUT EXPLAINS DETECTED ERRORS. 

PLEASE FORWARD THIS INFORMATION TO THE APPLICANT BY EITHER: 

1) INCLUDING A COPY OF THIS PRINTOUT IN YOUR NEXT COMMUNICATION TO THE 
APPLICANT, WITH A NOTICE TO COMPLY or, 

2) TELEPHONING APPLICANT AND FAXING A COPY OF THIS PRINTOUT, WITH A 
NOTICE TO COMPLY 

FOR CRF SUBMISSION QUESTIONS, PLEASE CONTACT MARK SPENCER, 703-308-4212. 

FOR SEQUENCE RULES INTERPRETATION, PLEASE CONTACT ROBERT WAX, 703-308-4216. 
PATENTIN 2.1 e-mail help: patin21help@uspto.gov or phone 703-306-4119 (R. Wax)
PATENTIN 3.0 e-mail help: patin3help@uspto.gov or phone 703-306-4119 (R. Wax)

TO REDUCE ERRORED SEQUENCE LISTINGS, PLEASE USE THE CHECKER 
VERSION 3.0 PROGRAM, ACCESSIBLE THROUGH THE U.S. PATENT AND
TRADEMARK OFFICE WEBSITE. SEE BELOW: 



Checker Version 3.0 

The Checker Version 3.0 application is a state-of-the-art Windows-based software program
employing a logical and intuitive user interface to check whether a sequence listing is in
compliance with format and content rules. Checker Version 3.0 works for sequence listings
generated for the original version of 37 CFR §§ 1.821 - 1.825 effective October 1, 1990 (old
rules) and the revised version (new rules) effective July 1, 1998, as well as World Intellectual
Property Organization (WIPO) Standard ST.25.

Checker Version 3.0 replaces the previous DOS-based version of Checker and is Y2K-compliant.
Checker allows public users to check sequence listings in computer readable form (CRF)
before submitting them to the United States Patent and Trademark Office (USPTO).
Use of Checker prior to filing the sequence listing is expected to result in fewer errored sequence
listings, thus saving time and money.



Checker Version 3.0 can be downloaded from the USPTO website at the following address:

http://www.uspto.gov/web/offices/pac/checker 
Raw Sequence Listing Error Summary



ERROR DETECTED     SUGGESTED CORRECTION     SERIAL NUMBER: 09/381,810

ATTN: NEW RULES CASES: PLEASE DISREGARD ENGLISH "ALPHA" HEADERS, WHICH WERE INSERTED BY PTO SOFTWARE



1  Wrapped Nucleics / Wrapped Aminos
   The number/text at the end of each line "wrapped" down to the next line. This may occur if your file
   was retrieved in a word processor after creating it. Please adjust your right margin to .3; this will
   prevent "wrapping."

2  Invalid Line Length
   The rules require that a line not exceed 72 characters in length. This includes white spaces.

3  Misaligned Amino Numbering
   The numbering under each 5th amino acid is misaligned. Do not use tab codes between numbers;
   use space characters instead.

4  Non-ASCII
   The submitted file was not saved in ASCII (DOS) text, as required by the Sequence Rules. Please
   ensure your subsequent submission is saved in ASCII text.

5  Variable Length
   Sequence(s) contain n's or Xaa's representing more than one residue. Per Sequence Rules,
   each n or Xaa can only represent a single residue. Please present the maximum number of each
   residue having variable length and indicate in the <220>-<223> section that some may be missing.

6  PatentIn 2.0 "bug"
   A "bug" in PatentIn version 2.0 has caused the <220>-<223> section to be missing from amino acid
   sequence(s). Normally, PatentIn would automatically generate this section from the
   previously coded nucleic acid sequence. Please manually copy the relevant <220>-<223> section to
   the subsequent amino acid sequence. This applies to the mandatory <220>-<223> sections for
   Artificial or Unknown sequences.

7  Skipped Sequences (OLD RULES)
   Sequence(s) missing. If intentional, please insert the following lines for each skipped sequence:

   (2) INFORMATION FOR SEQ ID NO:X: (insert SEQ ID NO where "X" is shown)
   (i) SEQUENCE CHARACTERISTICS: (Do not insert any subheadings under this heading)
   (xi) SEQUENCE DESCRIPTION: SEQ ID NO:X: (insert SEQ ID NO where "X" is shown)
   This sequence is intentionally skipped

   Please also adjust the "(ii) NUMBER OF SEQUENCES:" response to include the skipped sequences.

8  Skipped Sequences (NEW RULES)
   Sequence(s) missing. If intentional, please insert the following lines for each skipped sequence:

   <210> sequence id number
   <400> sequence id number
   000



9  Use of n's or Xaa's (NEW RULES)
   Use of n's and/or Xaa's has been detected in the Sequence Listing.
   Per 1.823 of Sequence Rules, use of <220>-<223> is MANDATORY if n's or Xaa's are present.
   In <220> to <223> section, please explain location of n or Xaa, and which residue n or Xaa represents.

10 Invalid <213> Response
   Per 1.823 of Sequence Rules, the only valid <213> responses are: Unknown, Artificial Sequence, or
   scientific name (Genus/species). <220>-<223> section is required when <213> response is Unknown or
   is Artificial Sequence.

11 Use of <220>
   Sequence(s) missing the <220> "Feature" and associated numeric identifiers and responses.
   Use of <220> to <223> is MANDATORY if <213> "Organism" response is "Artificial Sequence" or
   "Unknown." Please explain source of genetic material in <220> to <223> section.
   (See "Federal Register," 06/01/1998, Vol. 63, No. 104, pp. 29631-32) (Sec. 1.823 of Sequence Rules)

   PatentIn 2.0 "bug"
   Please do not use "Copy to Disk" function of PatentIn version 2.0. This causes a corrupted file,
   resulting in missing mandatory numeric identifiers and responses (as indicated on raw sequence
   listing). Instead, please use "File Manager" or any other manual means to copy file to floppy disk.

AMC - Biotechnology Systems Branch - 06/04/2001
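
Several of the format errors in the summary above (lines longer than 72 characters, tab codes between numbers, a file not saved as ASCII text) can be screened for before a listing is submitted. The following is a minimal Python sketch of such a screen. It is not the USPTO Checker program; the function name `screen_listing` and the wording of the messages are assumptions made for illustration only.

```python
MAX_LINE_LENGTH = 72  # limit quoted in the error summary; white space counts


def screen_listing(data: bytes) -> list[str]:
    """Return human-readable descriptions of format problems in a listing file."""
    problems = []
    try:
        text = data.decode("ascii")
    except UnicodeDecodeError as exc:
        # A non-ASCII byte suggests the file was saved in a word-processor format.
        problems.append(f"non-ASCII byte at offset {exc.start}")
        text = data.decode("ascii", errors="replace")
    for number, line in enumerate(text.splitlines(), start=1):
        if len(line) > MAX_LINE_LENGTH:
            problems.append(
                f"line {number}: {len(line)} characters (limit {MAX_LINE_LENGTH})"
            )
        if "\t" in line:
            problems.append(f"line {number}: tab code found; use space characters")
    return problems


if __name__ == "__main__":
    # Hypothetical input: one over-long line and one line containing a tab.
    sample = b"<210> 1\n" + b"A" * 80 + b"\nMet\tVal\n"
    for problem in screen_listing(sample):
        print(problem)
```

A screen of this kind only catches mechanical problems; it says nothing about whether the mandatory numeric identifiers are present or correct.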


SEQUENCE LISTING
(1) General Information
(i) APPLICANT: SANTEN PHARMACEUTICAL CO., LTD.
(ii) TITLE OF INVENTION: Novel Polypeptide Having Water Channel
     Activity and DNA sequence
(iii) NUMBER OF SEQUENCES: 2
(iv) CORRESPONDENCE ADDRESS:
     (A) ADDRESSEE: SANTEN PHARMACEUTICAL CO., LTD.
     (B) STREET: 9-19 Shimoshinjo 3-chome Higashiyodogawa-Ku
     (C) CITY: Osaka
     (D) STATE: Osaka
     (E) COUNTRY: JAPAN
     (F) ZIP: 533-0021
(v) COMPUTER READABLE FORM:
     (A) MEDIUM TYPE: Diskette, 3.5 inch, 1.44 MB, storage
     (B) COMPUTER: IBM PS/2 or compatibles
     (C) OPERATING SYSTEM: WINDOWS 95/97
     (D) SOFTWARE: Microsoft Word 97
(vi) CURRENT APPLICATION DATE:
     (A) APPLICATION NUMBER: 09/381,810
     (B) FILING DATE: 19-OCT-1999
     (C) CLASSIFICATION: 435
(vii) PRIOR APPLICATION DATE
     (A) APPLICATION NUMBER: JP 09 094845
     (B) FILING DATE: 28-MAR-1997
(viii) ATTORNEY/AGENT INFORMATION:
     (A) NAME: Burton A. Amernick
     (B) REGISTRATION NUMBER: 24852
     (C) REFERENCE/DOCKET NUMBER: 1581/00156
(ix) TELECOMMUNICATION INFORMATION:
     (A) TELEPHONE: (202) 331-7111
     (B) FAX: (202) 293-6229

(2) INFORMATION FOR SEQ ID NO: 1:
(i) SEQUENCE CHARACTERISTICS:
     (A) LENGTH: 342 amino acids
     (B) TYPE: amino acid
     (D) TOPOLOGY: linear
(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:

Met Val Gln Ala Ser Gly His Arg Arg Ser Thr Arg Gly Ser Lys Met
                5                   10                  15
Val Ser Trp Ser Val Ile Ala Lys Ile Gln Glu Ile Leu Gln Arg Lys
            20                  25                  30
Met Val Arg Glu Phe Leu Ala Glu Phe Met Ser Thr Tyr Val Met Met
        35                  40                  45
Val Phe Gly Leu Gly Ser Val Ala His Met Val Leu Asn Lys Lys Tyr
    50                  55                  6
Rules and Regulations



(3) Computer: Apple Macintosh;
(i) Operating System: Macintosh;
(ii) Macintosh File Type: text with line
termination;
(iii) Line Terminator: Pre-defined by
text type file;
(iv) Pagination: Pre-defined by text
type file;
(v) End-of-file: Pre-defined by text
type file;
(vi) Media: (A) Diskette - 3.50 inch, 400
Kb storage;
(B) Diskette - 3.50 inch, 800 Kb
storage;
(C) Diskette - 3.50 inch, 1.4 Mb
storage;
(vii) Print Command: Use PRINT
command from any Macintosh
application that processes text files,
such as MacWrite or TeachText;

(4) Magnetic tape: 0.5 inch, up to 2400
feet;
(i) Density: 1600 or 6250 bits per inch,
9 track;
(ii) Format: raw, unblocked;
(iii) Line Terminator: ASCII Carriage
Return plus optional ASCII Line Feed;
(iv) Pagination: ASCII Form Feed or
Series of Line Terminators;
(v) Print Command (Unix shell version
given here as sample response:
mt/dev/rmt0; lpr/dev/rmt0):

(g) Computer readable forms that are
submitted to the Office will not be
returned to the applicant.

(h) All computer readable forms shall
have a label permanently affixed thereto
on which has been hand printed or
typed, a description of the format of the
computer readable form as well as the
name of the applicant, the title of the
invention, the date on which the data
were recorded on the computer readable
form and the name and type of computer
and operating system which generated
the files on the computer readable form.
If all of this information cannot be
printed on a label affixed to the
computer readable form, by reason of
size or otherwise, the label shall include
the name of the applicant and the title of
the invention and a reference number,
and the additional information may be
provided on a container for the
computer readable form with the name
of the applicant, the title of the
invention, the reference number and the
additional information affixed to the
container. If the computer readable form
is submitted after the date of filing
under 35 U.S.C. 111, after the date of
entry in the national stage under 35
U.S.C. 371 or after the time of filing, in
the United States Receiving Office, an
international application under the PCT,
the labels mentioned herein must also
include the date of the application and
the application number, including series
code and serial number.



§ 1.825 Amendments to or replacement of
sequence listing and computer readable
copy thereof.

(a) Any amendment to the paper copy
of the "Sequence Listing" (§ 1.821(c))
must be made by the submission of
substitute sheets. Amendments must be
accompanied by a statement that
indicates support for the amendment in
the application, as filed, and a statement
that the substitute sheets include no
new matter. Such a statement must be a
verified statement if made by a person
not registered to practice before the
Office.

(b) Any amendment to the paper copy
of the "Sequence Listing," in accordance
with paragraph (a) of this section, must
be accompanied by a substitute copy of
the computer readable form (§ 1.821(e))
including all previously submitted data
with the amendment incorporated
therein, accompanied by a statement
that the copy in computer readable form
is the same as the substitute copy of the
"Sequence Listing." Such a statement
must be a verified statement if made by
a person not registered to practice
before the Office.

(c) Any appropriate amendments to
the "Sequence Listing" in a patent, e.g.,
by reason of reissue or certificate of
correction, must comply with the
requirements of paragraphs (a) and (b)
of this section.

(d) If, upon receipt, the computer
readable form is found to be damaged or
unreadable, applicant must provide,
within such time as set by the
Commissioner, a substitute copy of the
data in computer readable form
accompanied by a statement that the
substitute data is identical to that
originally filed. Such a statement must
be a verified statement if made by a
person not registered to practice before
the Office.

Appendix A - Sample Sequence Listing

(1) GENERAL INFORMATION:
  (i) APPLICANT: Doe, Joan X., Doe, John Q
  (ii) TITLE OF INVENTION: Isolation and
    Characterization of a Gene Encoding a
    Protease from Paramecium sp.
  (iii) NUMBER OF SEQUENCES: 2
  (iv) CORRESPONDENCE ADDRESS:
    (A) ADDRESSEE: Smith and Jones
    (B) STREET: 123 Main Street
    (C) CITY: Smalltown
    (D) STATE: Anystate
    (E) COUNTRY: USA
    (F) ZIP: 12345
  (v) COMPUTER READABLE FORM:
    (A) MEDIUM TYPE: Diskette, 3.50 inch, 800
      Kb storage
    (B) COMPUTER: Apple Macintosh
    (C) OPERATING SYSTEM: Macintosh 5.0
    (D) SOFTWARE: MacWrite
  (vi) CURRENT APPLICATION DATA:
    (A) APPLICATION NUMBER: 00/999,999
    (B) FILING DATE: 28-FEB-1989
    (C) CLASSIFICATION: 093/99
  (vii) PRIOR APPLICATION DATA:
    (A) APPLICATION NUMBER: PCT/US88/99999
    (B) FILING DATE: 01-MAR-1988
  (viii) ATTORNEY/AGENT INFORMATION:
    (A) NAME: Smith, John A.
    (B) REGISTRATION NUMBER: 00001
    (C) REFERENCE/DOCKET NUMBER: 01-0001
  (ix) TELECOMMUNICATION INFORMATION:
    (A) TELEPHONE: (909) 999-0001
    (B) TELEFAX: (909) 999-0002

(2) INFORMATION FOR SEQ ID NO: 1:
  (i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 954 base pairs
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear
  (ii) MOLECULE TYPE: genomic DNA
  (iii) HYPOTHETICAL: yes
  (iv) ANTI-SENSE: no
  (vi) ORIGINAL SOURCE:
    (A) ORGANISM: Paramecium sp
    (C) INDIVIDUAL/ISOLATE: XYZ2
    (G) CELL TYPE: unicellular organism
  (vii) IMMEDIATE SOURCE:
    (A) LIBRARY: genomic
    (B) CLONE: Para-XYZ2/M
  (x) PUBLICATION INFORMATION:
    (A) AUTHORS: Doe, Joan X., Doe, John Q
    (B) TITLE: Isolation and Characterization
      of a Gene Encoding a Protease from
      Paramecium sp.
    (C) JOURNAL: Fictional Genes
    (D) VOLUME: I
    (E) ISSUE: 1
    (F) PAGES: 1-20
    (G) DATE: 02-MAR-1988
    (K) RELEVANT RESIDUES IN SEQ ID NO:
      1: FROM 1 TO 954

BILLING CODE 3510-16-M
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:

ATCGGGATAG TACTGGTCAA GACCGGTGGA CACCGGTTAA CCCCGGTTAA GTACCGGTTA   60

TAGGCCATTT CAGGCCAAAT GTGCC[illegible] GGGCAACGTT [illegible]  120

ACGTTCGTAC GCACGTATGT ACCTAGGTAC TTACGGACGT GACTACGGAC ACTTCCGTAC  180

GTACGTACGT TTACGTACCC ATCCCAACGT AACCACAGTG TGGTCGCAGT GTCCCAGTGT  240

ACACAGACTG CCAGACATTC TTCACAGACA CCCC ATG ACA CCA CCT GAA CGT CTC  295
                                      Met Thr Pro Pro Glu Arg Leu
                                                      -30

TTC CTC CCA AGG GTG TGT GGC ACC ACC CTA CAC CTC CTC CTT CTG GGG  343
Phe Leu Pro Arg Val Cys Gly Thr Thr Leu His Leu Leu Leu Leu Gly
        -25                 -20                 -15

CTG CTG CTG GTT CTG CTG CCT GGG GCC CAT GTGAGGCAGC AGGAGAATGG  393
Leu Leu Leu Val Leu Leu Pro Gly Ala His
    -10                 -5

GGTGGCTCAG CCAAACCTTG AGCCCTAGAG CCCCCCTCAA CTCTGTTCTC CTAG GGG  450
                                                            Gly

CTC ATG CAT CTT GCC CAC AGC AAC CTC AAA CCT GCT GCT CAC CTC ATT  498
Leu Met His Leu Ala His Ser Asn Leu Lys Pro Ala Ala His Leu Ile
1               5                   10                  15

GTAAACATCC ACCTGACCTC CCAGACATGT CCCCACCAGC TCTCCTCCTA CCCCTGCCTC  558

AGGAACCCAA GCATCCACCC CTCTCCCCCA ACTTCCCCCA CGCTAAAAAA AACAGAGGGA  618

GCCCACTCCT ATGCCTCCCC CTGCCATCCC CCAGGAACTC AGTTGTTCAG TGCCCACTTC  678

TAC CCC AGC AAG CAG AAC TCA CTG CTC TGG AGA GCA AAC ACG GAC CGT  726
Tyr Pro Ser Lys Gln Asn Ser Leu Leu Trp Arg Ala Asn Thr Asp Arg
            20                  25                  30

GCC TTC CTC CAG GAT GGT TTC TCC TTG AGC AAC AAT TCT CTC CTG GTC  774
Ala Phe Leu Gln Asp Gly Phe Ser Leu Ser Asn Asn Ser Leu Leu Val
        35                  40                  45

TAGAAAAAAT AATTGATTTC AAGACCTTCT CCCCATTCTG CCTCCATTCT GACCATTTCA  834
GGGGTCGTCA CCACCTCTCC TTTGGCCATT CCAACAGCTC AAGTCTTCCC TGATCAAGTC  894
ACCGGAGCTT TCAAAGAAGG AATTCTAGGC ATCCCAGGGG ACCCACACCT CCCTGAACCA  954
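
In the sample sequence above, each amino acid is printed beneath the codon that encodes it. The pairing can be spot-checked mechanically. The following Python sketch does so under the assumption that the standard genetic code applies to this fictional sample; the names `CODON_TABLE`, `THREE_LETTER` and `translate`, and the `***` marker for a stop codon, are hypothetical and chosen for illustration.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, one letter per codon, in TCAG order (TTT, TTC, TTA, ...).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino
    for codon, amino in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# One-letter to three-letter symbols; "***" for a stop codon is an assumption.
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "***",
}


def translate(codons: str) -> list[str]:
    """Translate space-separated codons into three-letter amino acid symbols."""
    return [THREE_LETTER[CODON_TABLE[codon]] for codon in codons.upper().split()]


if __name__ == "__main__":
    # First coding codons of the sample, as printed in SEQ ID NO: 1.
    print(" ".join(translate("ATG ACA CCA CCT GAA CGT CTC")))
```

Comparing the printed result with the amino acid line of the listing shows whether a residue symbol was mistyped or shifted by one codon.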
(2) INFORMATION FOR SEQ ID NO: 2:
  (i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 82 amino acids
    (B) TYPE: amino acid
    (D) TOPOLOGY: linear
  (ii) MOLECULE TYPE: protein
  (ix) FEATURE:
    (A) NAME/KEY: signal sequence
    (B) LOCATION: -34 to -1
    (C) IDENTIFICATION METHOD: similarity
      to other signal sequences, hydrophobic
    (D) OTHER INFORMATION: expresses
      protease
  (x) PUBLICATION INFORMATION:
    (A) AUTHORS: Doe, Joan X., Doe, John Q
    (B) TITLE: Isolation and Characterization
      of a Gene Encoding a Protease from
      Paramecium sp.
    (C) JOURNAL: Fictional Genes
    (D) VOLUME: I
    (E) ISSUE: 1
    (F) PAGES: 1-20
    (G) DATE: 02-MAR-1988
    (K) RELEVANT RESIDUES IN SEQ ID NO:
      2: FROM -34 TO 48

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:

Met Thr Pro Pro Glu Arg Leu Phe Leu Pro Arg Val Cys Gly Thr Thr
                -30                 -25                 -20

Leu His Leu Leu Leu Leu Gly Leu Leu Leu Val Leu Leu Pro Gly Ala
            -15                 -10                 -5

His Gly Leu Met His Leu Ala His Ser Asn Leu Lys Pro Ala Ala His
        1               5                   10

Leu Ile Tyr Pro Ser Lys Gln Asn Ser Leu Leu Trp Arg Ala Asn Thr
15                  20                  25                  30

Asp Arg Ala Phe Leu Gln Asp Gly Phe Ser Leu Ser Asn Asn Ser Leu
                35                  40                  45

Leu Val
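
The numbering in the sample protein above follows a simple convention: residues of the signal sequence carry negative numbers counting up to -1, the first residue of the mature protein is numbered 1, and no residue is numbered 0. The following Python sketch applies that convention to the residues as printed in the sample; the function name `number_residues` is hypothetical, and the signal length of 34 is an assumption read off the numbering printed under the sequence.

```python
# Residues of sample SEQ ID NO: 2, transcribed from the listing above.
SAMPLE = """
Met Thr Pro Pro Glu Arg Leu Phe Leu Pro Arg Val Cys Gly Thr Thr
Leu His Leu Leu Leu Leu Gly Leu Leu Leu Val Leu Leu Pro Gly Ala
His Gly Leu Met His Leu Ala His Ser Asn Leu Lys Pro Ala Ala His
Leu Ile Tyr Pro Ser Lys Gln Asn Ser Leu Leu Trp Arg Ala Asn Thr
Asp Arg Ala Phe Leu Gln Asp Gly Phe Ser Leu Ser Asn Asn Ser Leu
Leu Val
""".split()


def number_residues(residues, signal_length):
    """Pair each residue with its position, skipping zero after the signal."""
    numbered = []
    for index, residue in enumerate(residues):
        offset = index - signal_length
        # Signal residues run from -signal_length to -1; mature residues start at 1.
        position = offset if offset < 0 else offset + 1
        numbered.append((position, residue))
    return numbered


if __name__ == "__main__":
    numbered = number_residues(SAMPLE, signal_length=34)
    print(len(numbered), "residues")
    print("first:", numbered[0], "last:", numbered[-1])
```

Counting the residues this way also gives a quick cross-check of the LENGTH entry and of the numbers printed under every fifth residue.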




